AITopics | empirical study

Supplementary Materials: An Empirical Study of Adder Neural Networks for Object Detection

Neural Information Processing Systems, Apr-25-2026, 11:37:54 GMT

As discussed in prior literature [1, 4], a single floating-point addition costs 0.9 pJ and a single floating-point multiplication costs 3.7 pJ. By comparison, an 8-bit integer addition costs 0.03 pJ and an 8-bit integer multiplication costs 0.2 pJ, far less than their floating-point counterparts. It is therefore important to explore whether adder detectors perform well under INT8 quantization. We applied INT8 post-training quantization to our Adder FCOS (B+N) model, which suffers a 0.8 mAP drop compared with the full-precision model, as shown in Table A. The energy reduction further increases from 29% to 35%. Note that post-training quantization is not optimal for INT8 models, and quantization-aware training may further improve the accuracy.
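The per-operation figures quoted above can be turned into a rough energy estimate. The sketch below is illustrative only: the function name `layer_energy_pj`, the operation counts, and the simplification that an adder layer spends two additions where a conventional layer spends one multiplication and one addition are all assumptions, not details taken from the paper.

```python
# Per-operation energy costs in picojoules, as quoted in the text above.
ENERGY_PJ = {
    ("float", "add"): 0.9,
    ("float", "mul"): 3.7,
    ("int8", "add"): 0.03,
    ("int8", "mul"): 0.2,
}


def layer_energy_pj(num_adds, num_muls, precision):
    """Estimate the arithmetic energy of a layer in pJ (hypothetical helper)."""
    return (num_adds * ENERGY_PJ[(precision, "add")]
            + num_muls * ENERGY_PJ[(precision, "mul")])


if __name__ == "__main__":
    ops = 1_000_000  # hypothetical number of multiply-accumulates in one layer
    for precision in ("float", "int8"):
        # Conventional layer: one multiplication and one addition per MAC.
        conv = layer_energy_pj(ops, ops, precision)
        # Assumed adder layer: the multiplication is replaced by an addition.
        adder = layer_energy_pj(2 * ops, 0, precision)
        print(f"{precision}: conventional {conv:.0f} pJ, adder {adder:.0f} pJ")
```

This counts arithmetic only. The 29% and 35% reductions reported in the text are whole-model figures and cannot be reproduced from this toy calculation.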

artificial intelligence, detector, machine learning, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.41)

Provably Transformers Harness Multi-Concept Word Semantics for Efficient In-Context Learning

Neural Information Processing Systems, Mar-21-2026, 03:44:11 GMT

Transformer-based large language models (LLMs) have displayed remarkable creative prowess and emergence capabilities. Existing empirical studies have revealed a strong connection between these LLMs' impressive emergence abilities and their in-context learning (ICL) capacity, allowing them to solve new tasks using only task-specific prompts without further fine-tuning. On the other hand, existing empirical and theoretical studies also show that there is a linear regularity of the multi-concept encoded semantic representation behind transformer-based LLMs. However, existing theoretical work fails to build up an understanding of the connection between this regularity and the innovative power of ICL. Additionally, prior work often focuses on simplified, unrealistic scenarios involving linear transformers or unrealistic loss functions, and achieves only linear or sub-linear convergence rates. In contrast, this work provides a fine-grained mathematical analysis to show how transformers leverage the multi-concept semantics of words to enable powerful ICL and excellent out-of-distribution ICL abilities, offering insights into how transformers innovate solutions for certain unseen tasks encoded with multiple cross-concept semantics. Inspired by empirical studies on the linear latent geometry of LLMs, the analysis is based on a concept-based low-noise sparse coding prompt model. Leveraging advanced techniques, this work showcases the exponential 0-1 loss convergence over the highly non-convex training dynamics, which pioneeringly incorporates the challenges of softmax self-attention, ReLU-activated MLPs, and cross-entropy loss. Empirical simulations corroborate the theoretical findings.

large language model, natural language, proceedings, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

456048afb7253926e1fbb7486e699180-AuthorFeedback.pdf

Neural Information Processing Systems, Feb-12-2026, 01:13:06 GMT

extension, function evaluation, reviewer, (13 more...)

Neural Information Processing Systems

Genre: Research Report (0.30)

Technology: Information Technology > Artificial Intelligence (0.30)

2 Neural network ensembles and their relations to kernels

Neural Information Processing Systems, Feb-11-2026, 02:36:22 GMT

Although the ongoing success of deep learning is remarkable, the increasing data, model and training algorithm complexity make a thorough understanding of their inner workings increasingly difficult.

artificial intelligence, ensemble, machine learning, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

95e62984b87e90645a5cf77037395959-AuthorFeedback.pdf

Neural Information Processing Systems, Feb-9-2026, 10:14:30 GMT

finetune task, influence function, reviewer, (14 more...)

Neural Information Processing Systems

Technology:

Information Technology > Data Science > Data Quality (0.56)
Information Technology > Artificial Intelligence > Machine Learning (0.53)

highlight the novelty of our work along other axes: 4

Neural Information Processing Systems, Feb-8-2026, 18:31:23 GMT

This is in contrast with past studies failing to scale DFA past simple datasets like MNIST.

artificial intelligence, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.52)

Resetting the Optimizer in Deep RL: An Empirical Study

Neural Information Processing Systems, Dec-27-2025, 01:24:22 GMT

We focus on the task of approximating the optimal value function in deep reinforcement learning. This iterative process consists of solving a sequence of optimization problems where the loss function changes per iteration. The common approach to solving this sequence of problems is to employ modern variants of the stochastic gradient descent algorithm such as Adam. These optimizers maintain their own internal parameters, such as estimates of the first-order and second-order moments of the gradient, and update them over time. Therefore, information obtained in previous iterations is used to solve the optimization problem in the current iteration. We demonstrate that this can contaminate the moment estimates because the optimization landscape can change arbitrarily from one iteration to the next. To hedge against this negative effect, a simple idea is to reset the internal parameters of the optimizer when starting a new iteration. We empirically investigate this resetting idea by employing various optimizers in conjunction with the Rainbow algorithm. We demonstrate that this simple modification significantly improves the performance of deep RL on the Atari benchmark.
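The resetting idea described above can be sketched with a hand-written Adam optimizer whose internal state is visible. This is a minimal illustration, not the paper's setup: the `Adam` class, the `fit_sequence` helper, the hyperparameters, and the toy quadratic losses that stand in for the per-iteration value-function problems are all assumptions made for the example.

```python
import math


class Adam:
    """Minimal Adam over a list of floats (illustrative, not a library API)."""

    def __init__(self, params, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
        self.params = params
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.reset()

    def reset(self):
        """Zero the moment estimates and the step counter."""
        self.m = [0.0] * len(self.params)  # first-moment estimates
        self.v = [0.0] * len(self.params)  # second-moment estimates
        self.t = 0                         # step count for bias correction

    def step(self, grads):
        self.t += 1
        for i, g in enumerate(grads):
            self.m[i] = self.beta1 * self.m[i] + (1 - self.beta1) * g
            self.v[i] = self.beta2 * self.v[i] + (1 - self.beta2) * g * g
            m_hat = self.m[i] / (1 - self.beta1 ** self.t)
            v_hat = self.v[i] / (1 - self.beta2 ** self.t)
            self.params[i] -= self.lr * m_hat / (math.sqrt(v_hat) + self.eps)


def fit_sequence(targets, steps_per_iteration=200, reset_each_iteration=True):
    """Minimize (w - target)^2 for each target in turn (toy stand-in losses)."""
    params = [0.0]
    opt = Adam(params, lr=0.05)
    for target in targets:
        if reset_each_iteration:
            # Discard moments that were estimated on the previous loss.
            opt.reset()
        for _ in range(steps_per_iteration):
            grad = 2.0 * (params[0] - target)
            opt.step([grad])
    return params[0], opt


if __name__ == "__main__":
    for flag in (True, False):
        w, opt = fit_sequence([1.0, -1.0], reset_each_iteration=flag)
        print(f"reset={flag}: w={w:.4f}, optimizer step count={opt.t}")
```

Without the reset, the moments and step counter accumulated on the first loss carry over into the second, which is the contamination the abstract describes. Whether resetting helps on a toy problem like this one says nothing about the Atari results reported in the paper.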

iteration, name change, resetting, (6 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.60)

An Empirical Study Towards Prompt-Tuning for Graph Contrastive Pre-Training in Recommendations

Neural Information Processing Systems, Dec-26-2025, 18:08:44 GMT

Graph contrastive learning (GCL) has emerged as a potent technology for numerous graph learning tasks. It has been successfully applied to real-world recommender systems, where the contrastive loss and the downstream recommendation objectives are always combined to form the overall objective function. Such a strategy is inconsistent with the original GCL paradigm, where graph embeddings are pre-trained without involving downstream training objectives. In this paper, we innovatively propose a prompt-enhanced framework for GCL-based recommender systems, namely CPTPP, which can fully leverage the advantages of the original GCL protocol through prompt tuning. Specifically, we first summarise user profiles in graph recommender systems to automatically generate personalized user prompts. These prompts will then be combined with pre-trained user embeddings to conduct prompt-tuning in downstream tasks, thereby narrowing the distinct targets between pre-training and downstream tasks.

empirical study, graph contrastive pre-training, prompt-tuning, (8 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.86)

Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift

Neural Information Processing Systems, Dec-25-2025, 16:01:40 GMT

We might hope that when faced with unexpected inputs, well-designed software systems would fire off warnings. Machine learning (ML) systems, however, which depend strongly on properties of their inputs (e.g. the i.i.d.

detecting dataset shift, empirical study, name change, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

322f62469c5e3c7dc3e58f5a4d1ea399-AuthorFeedback.pdf